In this tutorial, you will learn some basic steps to start working with (geo)-spatial data in R. I prepared this tutorial as a intuitive “hands on” introduction, but I provide links for those interested in more background and theory.
We will cover following steps :
sf
ggplotmapviewI’ve preloaded the packages for this tutorial with
library(sf)
library(leaflet)
library(mapview)
library(leafpop)
library(units)
Throughout this tutorial we will be using the pipe operator, %>%. You can use the pipe to rewrite multiple operations in a way that you can read left-to-right, top-to-bottom. We’ll use piping frequently because it considerably improves the readability of code. The pipe is a defining feature of the tidyverse, so we will load this set of packages as well.
library(tidyverse)
The tidyverse package also includes ggplot2, which allow us to create static maps in a similar modular fashion, but using the + operator instead of the pipe.
For the first part of this tutorial we will create our own spatial objects. Further on, to practice more spatial functions, we will use the franconia, breweries and trails data sets from package mapview. These data sets are documented in help(package=mapview).
I learnt to use R for statistical analysis when I was a student in the University of Bayreuth in Germany in the year 2000.
At that time I needed to create a siple map for my study area, and I though R could be a cool tool for that, how hard could it be?
So I gave it a try at mapping using just the basic plotting functions:
My first “map” in R, using basic plotting functions
At that time, there were not many options to work with spatial data in R, and the few available packages required ad hoc formats and had limited numbers of functions for very specialised tasks.
Fortunately a group of R users and developers started to tackle the issue of integrating Geographical Information Systems (GIS) into R.
The group of R packages in rspatial allowed users to:
More recently, the r-spatial group of packages (mind the “-” in the name…) updated, standardized and modernized many of these packages by: - adoption of open standards: “simple features” with sf package - handling of spatiotemporal arrays (Data cubes) with stars package
Also, the development of powerful external libraries and services allow better visualisation and analysis in R. Some options include:
Visit the CRAN Task View: Analysis of Spatial Data for a comprehensive list of packages and links to many more resources.
Nowadays, there are plenty of options for working with spatial data in R:
Now that we have discussed some background on packages for spatial data in R, let’s start with our first spatial object.
For the first part of this tutorial we will create our own spatial objects.
sfThere are many alternative ways to define and work with geo-spatial data in R, but the simple features approach in package sf has many advantages and has become the new standard.
Simple features are a standardized way to encode spatial vector data. This package binds to several external libraries like GDAL, GEOS and PROJ for many spatial operations.
If you want to understand simple features in R, you should visit the documentation of the sf package.
Functions from package sf have the prefix st_. Many of these functions are described by the simple features standard, so, if you are familiar with spatial databases such as PostGIS and other GIS software, you might recognise many of these.
A simple feature needs to have a geometry, this can be a point, multi-point, linestring, multi-linestring, polygon, multi-polygon, it can also be linked to a collection of several geometries.
In R you can use this function to define a point geometry:
my_geom <- st_point(c(22.431028, 37.075074))
For geospatial data we need to define a coordinate reference system (crs) to interpret these coordinates correctly. They are usually defined by several parameters in a standard format (with heavy use of geospatial jargon), but there are handy shortcuts for the most common examples. Here we will focus on plain latitude and longitude coordinates, known as the latlong projection based on the WGS84 datum, this is also referred to as EPSG:4326 (Check details at https://epsg.io/4326)
my_crs <- st_crs("+proj=latlong +datum=WGS84")
Typically you will want to attach one or more attributes or variables to your geometry, for example we can start with a text, and maybe a url link in a data frame (or a tibble):
data <- data.frame(text='This is Sparta!',
url='https://64.media.tumblr.com/c963fce8a6e638323f5e60df33c127f3/tumblr_mq3bykOEmD1s9xi1so1_500.gif')
But you are not limited by this, you can have a data frame with many columns, each column can represent a variable, a measurement, you can have factors, texts, colors, etc.
Each row in the data frame represents an observation or element, so they need to match the number of spatial features. In this case it is easy with just one point and one row in the data frame, but keep this in mind when creating your own objects!
So we are now ready to make our simple feature object,
One way to put this together and create a simple feature with function st_sf is this:
sparta <- st_sf(data,
crs=my_crs,
geometry=st_sfc(my_geom, crs=my_crs))
Notice that I am using st_sfc to combine the geometry and crs into a simple feature geometry list column.
If we want to combine several points into one simple feature object, and assuming all share the same crs:
pt1 <- st_point(c(23.978333, 38.118056))
pt2 <- st_point(c( 23.2, 39.05))
pt3 <- st_point(c(20.716667, 38.366667))
data <- data.frame(name=c('Marathon','Arthemision','Ithaca'), famous_for=c('Battle of Marathon','Battle of Arthemision',"Homer's Odyssey"), url=c('https://en.wikipedia.org/wiki/Battle_of_Marathon','https://en.wikipedia.org/wiki/Battle_of_Artemisium','https://en.wikipedia.org/wiki/Ithaca'), year=c(-490,-480,-800))
not_sparta <- st_sf(data,
crs=my_crs,
geometry=st_sfc(pt1, pt2,pt3))
This object has many more attributes than your usual data frame:
not_sparta
## Simple feature collection with 3 features and 4 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 20.71667 ymin: 38.11806 xmax: 23.97833 ymax: 39.05
## CRS: +proj=latlong +datum=WGS84
## name famous_for
## 1 Marathon Battle of Marathon
## 2 Arthemision Battle of Arthemision
## 3 Ithaca Homer's Odyssey
## url year
## 1 https://en.wikipedia.org/wiki/Battle_of_Marathon -490
## 2 https://en.wikipedia.org/wiki/Battle_of_Artemisium -480
## 3 https://en.wikipedia.org/wiki/Ithaca -800
## geometry
## 1 POINT (23.97833 38.11806)
## 2 POINT (23.2 39.05)
## 3 POINT (20.71667 38.36667)
More important, this object has all we need to display information on a map:
mapview(sparta, popup = popupImage(img=sparta$url, src ="remote")) + mapview(not_sparta, col.regions = "blanchedalmond")
Now that we know wherelet’s create our own simple feature object for this tutorial.
We want to define a point for the University of Bayreuth (Universität Bayreuth) in Franconia, Germany with the supplied coordinates and crs, fill the blanks and hit run
#my_data <- ____(full_name='Universität Bayreuth',
# url='https://uni-bayreuth.de/')
#my_crs <- ____("EPSG:4326")
#my_geom <- ____(c(11.585833, 49.928889))
#UBT <- st_sf(my_data,
# crs=my_crs,
# geometry=____)
my_data <- data.frame(full_name='Universität Bayreuth',
url='https://uni-bayreuth.de/')
my_crs <- st_crs("EPSG:4326")
my_geom <- st_point(c(11.585833, 49.928889))
UBT <- st_sf(my_data,
crs=my_crs,
geometry=st_sfc(my_geom))
Hint: You need to define the data and the simple feature collection to create your object. Use functions st_point, st_crs and st_sfc.
We just created our first spatial object, how cool is that?
However, spatial data are often more complex and we want to use R to answer questions about the spatial data, not only display them.
In the next section we will find out some ways to do that.
sfFor the rest of the tutorial we want to explore the surroundings of the University of Bayreuth, my alma mater.
There is not much we can do with one point, so we will look at a couple of data sets from this region, known as “Franken” in german or “Franconia” in other languages.
The best way to start our tour is to explore one of its main attractions: the breweries or Biergärten.
If you want to have an overview of a sf object in R, just type its name, click on the run button to show data and attributes of this object:
breweries
This is basically the same structure we have seen before, but with more points and several variables.
If you get a warning about and old-style crs object, you can simply update this with:
st_crs(breweries) <- "EPSG:4326"
We can use the function st_drop_geometry to look at the data without the spatial component:
breweries %>% st_drop_geometry()
st_geometry to look at the spatial component without the data:
breweries %>% st_geometry()
There are many ways to explore data in spatial objects. We might be interested in non-spatial information, for example, how many types of beers are there in each breweries?
breweries %>% pull(number.of.types) %>% hist(main="Nr. of types of beers")
But how do we know where are the breweries with the most types of beers?
Here we will select one variable from our object and use the defaul plot function for sf objects:
breweries %>% select(number.of.types) %>% plot(pch=19)
So, where would you go to have the largest selection of beers?
You can imagine this is a question that was frequently asked in academic circles of the University. So let’s find the answer!
First we will need to define what is the desired number of beers we expect. We can do this by filtering our spatial object based on a variable:
selected_breweries <- breweries %>% filter(number.of.types>9)Then we want to know which one of these is nearest to our
qry <- st_nearest_feature(UBT,selected_breweries)Now we can check the details for this record in the spatial object:
selected_breweries %>% slice(qry)Finally we want to know the distance from the uni:
st_distance(UBT,selected_breweries %>% slice(qry))Try running these lines of code, and see if you can answer the questions below:
selected_breweries <- breweries %>% filter(number.of.types>9)
qry <- st_nearest_feature(UBT,selected_breweries)
selected_breweries %>% slice(qry)
st_distance(UBT,selected_breweries %>% slice(qry))
We learned how to plot and query spatial and non-spatial information from a sf object, well done!
We will come back to the questions of plotting and querying spatial objects when we have a look at other types of spatial objects. Please continue to the next section.
sfSo we know the basics of a simple feature object with points, and we have seen some simple ways to interact with this object. We will now learn that this structure will be very similar for other kind of geometries and this makes working with multiple objects very straightforward.
Let’s check one object that is already in one of the loaded data sets, it contains the boundaries of the municipalities of Franconia.
We will update the crs information to avoid warnings about an old-style crs object:
st_crs(franconia) <- "EPSG:4326"
Check the information of this object, what is different from previous examples? what is similar?:
franconia
All sf objects have some common structure, regardless of how complex the spatial information is. Theinteresting feature here is that the geometry is composed of several polygons that can be associated with a single record in the data frame. Many spatial objects are nested, and thus multiple elements (points, lines, polygons) can share the same attributes. The multi-type of geometries allow to handle this kind of data in the same way as we would handle single points, lines or polygons.
We have seen before that we can query both the geometry and the non spatial data of sf objects.
Let’s plot the geometry without any attributes:
franconia %>% st_geometry() %>% plot
Or plot ALL attributes. (This can be kind of annoying for very large and complex objects…)
franconia %>% plot
# franconia %>% ____(district) %>% ____
franconia %>% select(district) %>% plot
We can go one step further and dissolve the boundaries of adjacent municipalities of the same district, this will give us a new spatial object with three features:
franconia %>% group_by(district) %>% summarise(new_geom=st_union(geometry)) %>% plot
Notice how we can use spatial functions like st_union with other functions like group_by and summarise.
If all our data is in the same projection (sharing the same or similar crs) we can perform spatial queries, for example, we want to extract the intersection of the location of the University of Bayreuth (in object UBT) with the Franconia municipalities (in object franconia)
st_intersection(UBT,franconia)
The result is a point geometry with all variables from the UBT and the franconia data sets.
Simple feature object have their own plot functions, but I find it easier to combine different spatial objects with the ggplot2 functions.
In ggplot, each layer of plot elements is handled by a geom_ function, for sf object the function is called geom_sf.
We can combine the geom_sf calls with other ggplot functions to modify colors, etc.
ggplot() +
geom_sf(data = franconia, aes(fill=district)) +
geom_sf(data = breweries, col="darkgreen") +
geom_sf(data = UBT, cex=2, col="brown") +
scale_fill_brewer(palette = "Greys")

Better yet, in ggplot we can use variables in different aesthetic elements of the plot, for example, colour or size:
ggplot() +
geom_sf(data = franconia, aes(fill=district)) +
geom_sf(data = breweries, aes(size=number.of.types, colour=founded)) +
geom_sf(data = UBT, cex=2, col="brown") +
scale_fill_brewer(palette = "Greys") +
scale_colour_gradientn(colours = brewer.pal(5,"Oranges"))
## Warning: Removed 150 rows containing missing values (geom_sf).

We just learned to work with polygons, how to use spatial functions st_union, st_intersection and how to plot spatial objects with `ggplot, well done!
In the next section we will work with objects in different projections.
sfThis data set has a selection of hiking trails in Franconia, the geometry here is a multistring, which are collections of lines. Notice that in this case we are using a different EPSG code, it refers to a specific zone of the Universal Transverse Mercator projection for this region of the world.
trails
Again, if you see a warning about old-style crs, then reassign the same crs.
st_crs(trails) <- "EPSG:32632"
| The Universal Transverse Mercator (UTM) is a map projection system for assigning coordinates to locations on the surface of the Earth. The UTM system divides the Earth into 60 zones, each 6° of longitude in width, and uses a projection that can map a region of large north-south extent with low distortion. |
We are often interested in the spatial dimensions of object, for example in this case we want to know the length of each trail, we use the function st_length:
trails %>% st_length() %>% set_units('km') %>% hist(main="Length of trails")
Often you want to add information to your spatial object, so that you can use that information latter in your map or in other analysis, here we will use the function mutate to add the length as a variable of the trails object, and then find the longest trail for a hiking excursion.
trails %>% mutate(length=st_length(geometry)) -> trails
trails %>% slice(which.max(length))
We can also filter by a desired length and create objects for short and long trails:
trails %>% mutate(length=st_length(geometry)) %>% filter(length>set_units(3000,'m'),length<set_units(3400,'m')) -> short_trails
trails %>% mutate(length=st_length(geometry)) %>% filter(length>set_units(40000,'m')) -> long_trails
When plotting object, geom_sf will handle the projections automatically, in this case the trails object is projected to geographical coordinates (some people would call this “inverse projection”). Notice the graticule of geographical coordinates in the background is rectilinear (1 degree in latitude is equal to 1 degree in longitude)?
ggplot() +
geom_sf(data = franconia, aes(fill=district)) +
geom_sf(data = UBT, cex=2, col="yellow") +
geom_sf(data = long_trails, col="whitesmoke")

If we want to have all our data in the UTM projection, we could invert the order in which we call the objects:
ggplot() +
geom_sf(data=long_trails, col="whitesmoke") +
geom_sf(data = franconia , aes(fill=district)) +
geom_sf(data=UBT , cex=2, col="yellow")
How does this look? Do you notice the graticule behind our object has changed?
Unfortunately the lines are overlaid with polygons and we can not see them. So this is NOT the best way to do this.
If we want to have all our data projected on the fly, we can add a call to coord_sf and specify our choice of crs:
ggplot() +
geom_sf(data = franconia, aes(fill=district)) +
geom_sf(data = UBT, cex=2, col="yellow") +
geom_sf(data = long_trails, col="whitesmoke") +
coord_sf(crs=st_crs(long_trails))
However, if we want to work with the projected data more than once, we can save time by doing the transformation once and then using the projected objects instead of the originals.
Let’s try a projection, just hit the run button to see what this means:
(UBT_utm = UBT %>% st_transform(st_crs(trails)))
Can you do the same with the franconia dataset?
# franconia_utm = ____ %>% ____(____)
franconia_utm = franconia %>% st_transform(st_crs(trails))
Check how this look like
ggplot() +
geom_sf(data = franconia_utm, aes(fill=district)) +
geom_sf(data = UBT_utm, cex=2, col="yellow") +
geom_sf(data =long_trails, col="whitesmoke")

We learned to work with lines, how to use spatial functions st_length and how to handle projections of spatial objects with `ggplot and with the function st_transform. Great!
In the last section we will use mapview to explore all the types of data interactively.
mapviewMapview is …
Most of the examples here are based on the help-page and tutorial of mapview, I encourage you to explore their examples to learn more about the full range of options.
Mapview uses a similar syntax as ggplot, you can add components with the + operator:
mapview(franconia) + mapview(trails) + mapview(breweries) + mapview(UBT)
Navigation of a mapview element is quite intuitive
We can change the colors and other attributes of each layer
UBT_url <- "https://www.study-in-bavaria.de/fileadmin/_processed_/csm_Campus-Luftbild_1_qf_03_8a455a1f25.jpg"
mapview(franconia, zcol = "district", legend = TRUE) +
mapview(trails) +
mapview(breweries, zcol = "founded", at = seq(1400, 2200, 200), legend = TRUE) +
mapview(UBT, popup = popupImage(img=UBT_url, src ="remote"), col.regions = "blanchedalmond")
You can also combine several function calls to create maps of computed values for your spatial objects. This is an example for creating a map of density of breweries per municipality
franconia %>%
mutate(count = lengths(st_contains(., breweries)),
area = st_area(.) %>% set_units('km^2'),
density = count / area) %>%
mapview(zcol = "density")
We just learned to use mapview to visualise spatial data interactively.
I hope this has been useful and you feel more confident to start working with geo-spatial data in R.